Investigating Bilingual Deep Neural Networks for Automatic Recognition of Code-switching Frisian Speech

Authors

  • Emre Yilmaz
  • Henk van den Heuvel
  • David A. van Leeuwen
Abstract

In this paper, a code-switching automatic speech recognition (ASR) system built for the Frisian language is described. Frisian is mostly spoken in the province of Fryslân, which is located in the north of the Netherlands. The native speakers of Frisian are mostly bilingual and often code-switch in daily conversations due to the extensive influence of the Dutch language. In the scope of the FAME! Project, the influence of this unforeseen language switching on modern ASR systems will be investigated with the objective of building a robust recognizer that can handle this phenomenon. For this purpose, in this work, we design a bilingual deep neural network (DNN)-based ASR system and investigate the impact of bilingual DNN training in the context of code-switching speech. © 2016 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the Organizing Committee of SLTU 2016.

Similar resources

Exploiting Untranscribed Broadcast Data for Improved Code-Switching Detection

We have recently presented an automatic speech recognition (ASR) system operating on Frisian-Dutch code-switched speech. This type of speech requires careful handling of unexpected language switches that may occur in a single utterance. In this paper, we extend this work by using some raw broadcast data to improve multilingually trained deep neural networks (DNN) that have been trained on 11.5 ...

Open Source Speech and Language Resources for Frisian

In this paper, we present several open source speech and language resources for the under-resourced Frisian language. Frisian is mostly spoken in the province of Fryslân which is located in the north of the Netherlands. The native speakers of Frisian are Frisian-Dutch bilingual and often code-switch in daily conversations. The resources presented in this paper include a code-switching speech da...

A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research

We present a new speech database containing 18.5 hours of annotated radio broadcasts in the Frisian language. Frisian is mostly spoken in the province Fryslân and it is the second official language of the Netherlands. The recordings are collected from the archives of Omrop Fryslân, the regional public broadcaster of the province Fryslân. The database covers almost a 50-year time span. The nativ...

Development of bilingual ASR system for MediaParl corpus

The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we employ frequency domain linear prediction-based (FDLP) features to reduce the effect of reverberation, exploit bilingual deep neural networks ap...

Addressing Code-Switching in French/Algerian Arabic Speech

This study focuses on code-switching (CS) in French/Algerian Arabic bilingual communities and investigates how speech technologies, such as automatic data partitioning, language identification and automatic speech recognition (ASR) can serve to analyze and classify this type of bilingual speech. A preliminary study carried out using a corpus of Maghrebian broadcast data revealed a relatively hi...

Publication date: 2016